The Sample Superstore dataset is a collection of sales data from a fictitious superstore. It includes information about customer demographics, product categories, shipping details, and sales performance. The dataset allows analysis of sales patterns, customer behavior, and operational insights to improve store efficiency and profitability.
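# Data handling
import numpy as np
import pandas as pd

# Plotting and image display
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.offline
from PIL import Image

# Render Plotly figures inline in the notebook
plotly.offline.init_notebook_mode()

# Load the Superstore workbook (reading .xls files requires the xlrd package)
df = pd.read_excel('Data/Sample - Superstore.xls')
df.head()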
| | Row ID | Order ID | Order Date | Ship Date | Ship Mode | Customer ID | Customer Name | Segment | Country/Region | City | ... | Postal Code | Region | Product ID | Category | Sub-Category | Product Name | Sales | Quantity | Discount | Profit |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | CA-2020-152156 | 2020-11-08 | 2020-11-11 | Second Class | CG-12520 | Claire Gute | Consumer | United States | Henderson | ... | 42420.0 | South | FUR-BO-10001798 | Furniture | Bookcases | Bush Somerset Collection Bookcase | 261.9600 | 2 | 0.00 | 41.9136 |
| 1 | 2 | CA-2020-152156 | 2020-11-08 | 2020-11-11 | Second Class | CG-12520 | Claire Gute | Consumer | United States | Henderson | ... | 42420.0 | South | FUR-CH-10000454 | Furniture | Chairs | Hon Deluxe Fabric Upholstered Stacking Chairs,... | 731.9400 | 3 | 0.00 | 219.5820 |
| 2 | 3 | CA-2020-138688 | 2020-06-12 | 2020-06-16 | Second Class | DV-13045 | Darrin Van Huff | Corporate | United States | Los Angeles | ... | 90036.0 | West | OFF-LA-10000240 | Office Supplies | Labels | Self-Adhesive Address Labels for Typewriters b... | 14.6200 | 2 | 0.00 | 6.8714 |
| 3 | 4 | US-2019-108966 | 2019-10-11 | 2019-10-18 | Standard Class | SO-20335 | Sean O'Donnell | Consumer | United States | Fort Lauderdale | ... | 33311.0 | South | FUR-TA-10000577 | Furniture | Tables | Bretford CR4500 Series Slim Rectangular Table | 957.5775 | 5 | 0.45 | -383.0310 |
| 4 | 5 | US-2019-108966 | 2019-10-11 | 2019-10-18 | Standard Class | SO-20335 | Sean O'Donnell | Consumer | United States | Fort Lauderdale | ... | 33311.0 | South | OFF-ST-10000760 | Office Supplies | Storage | Eldon Fold 'N Roll Cart System | 22.3680 | 2 | 0.20 | 2.5164 |
5 rows × 21 columns
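As a quick illustration of the shipping details in the data, the sketch below computes the number of days between order and shipment. It rebuilds only the five preview rows shown above rather than reading the Excel file, so it runs on its own; `preview`, `lead_time`, and the `Days to Ship` column are names introduced here for illustration and are not part of the dataset.

```python
import pandas as pd

# Stand-in for df: the five preview rows, restricted to the columns used here.
preview = pd.DataFrame({
    "Order ID": ["CA-2020-152156", "CA-2020-152156", "CA-2020-138688",
                 "US-2019-108966", "US-2019-108966"],
    "Order Date": pd.to_datetime(["2020-11-08", "2020-11-08", "2020-06-12",
                                  "2019-10-11", "2019-10-11"]),
    "Ship Date": pd.to_datetime(["2020-11-11", "2020-11-11", "2020-06-16",
                                 "2019-10-18", "2019-10-18"]),
    "Ship Mode": ["Second Class", "Second Class", "Second Class",
                  "Standard Class", "Standard Class"],
})

# Days between order and shipment for each row (hypothetical column name).
preview["Days to Ship"] = (preview["Ship Date"] - preview["Order Date"]).dt.days

# Average lead time per shipping mode.
lead_time = preview.groupby("Ship Mode")["Days to Ship"].mean()
print(lead_time)
```

The same two lines should work on the full `df`, assuming `Order Date` and `Ship Date` were parsed as datetimes when the workbook was loaded.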
This scatter plot shows profit against sales for each order line, colored by product category.
fig = px.scatter(df, x="Sales", y="Profit", color="Category")
fig.show()
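A numeric summary can complement the scatter plot. The following is a minimal sketch that totals sales and profit per category and lists loss-making rows. It uses only the five preview rows from `df.head()`, so the figures say nothing about the full dataset; `preview`, `summary`, `losses`, and the `Margin` column are illustrative names, not part of the original notebook.

```python
import pandas as pd

# Stand-in for df: Category, Sales and Profit from the five preview rows.
preview = pd.DataFrame({
    "Category": ["Furniture", "Furniture", "Office Supplies",
                 "Furniture", "Office Supplies"],
    "Sales": [261.96, 731.94, 14.62, 957.5775, 22.368],
    "Profit": [41.9136, 219.582, 6.8714, -383.031, 2.5164],
})

# Total sales and profit per category, plus profit as a share of sales.
summary = preview.groupby("Category")[["Sales", "Profit"]].sum()
summary["Margin"] = summary["Profit"] / summary["Sales"]

# Rows that would fall below the zero-profit line in the scatter plot.
losses = preview[preview["Profit"] < 0]

print(summary)
print(losses)
```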
| Name | Roll Number |
|---|---|
| Sidhartha S Mondal | 101436978 |
| Bharat Sharma | 101444431 |
| Chirag Bhatia | 101441822 |
$$ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_n x_n + \varepsilon $$
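The notebook does not say how the coefficients in the equation above are estimated. One common choice is ordinary least squares, sketched below with NumPy on synthetic data. Everything in the example is an assumption made for illustration: the two predictors, the `true_beta` values, and the noise level are made up and are not taken from the Superstore data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic predictors x_1, x_2 and made-up coefficients beta_0, beta_1, beta_2.
X = rng.normal(size=(n, 2))
true_beta = np.array([1.5, 2.0, -0.5])
eps = rng.normal(scale=0.1, size=n)  # noise term epsilon

# Generate y according to y = beta_0 + beta_1 x_1 + beta_2 x_2 + epsilon.
y = true_beta[0] + X @ true_beta[1:] + eps

# Prepend a column of ones so the intercept beta_0 is estimated as well.
A = np.column_stack([np.ones(n), X])

# Least-squares estimate of the coefficient vector.
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print(beta_hat)
```

With this little noise the estimates should land close to `true_beta`. On real data such as `Sales` and `Discount` as predictors of `Profit`, the fit would also need the usual checks of residuals and model assumptions.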
# Display the image without axes, ticks, or frame
img = Image.open('Img/Image2.jpeg')
plt.imshow(img)
plt.axis('off')  # hides both axes, so no per-axis visibility calls are needed
plt.show()